ENUMERATING TREES 



ROBERT A. KUCHARCZYK 



Abstract. In this note we discuss trees similar to the Calkin-Wilf tree, a binary tree that
enumerates all positive rational numbers in a simple way. The original construction of Calkin 
and Wilf is reformulated in a more algebraic language, and an elementary application of methods 
from analytic number theory gives restrictions on possible analogues. 



Contents

1. The Calkin-Wilf Tree
2. The Monoid $\mathrm{SL}_2(\mathbb{N}_0)$
3. Injective Families
4. Heights on $\mathbb{P}^1$ and the Distribution of Points
5. Constraints on Injective Families
References



1. The Calkin-Wilf Tree

In [6], Neil Calkin and Herbert Wilf introduced a remarkably beautiful¹ way to enumerate the
positive rational numbers, drawing together several observations by Stern [16] and Reznick [12].
The enumeration is along a binary tree in the sense of computer science, i.e. an infinite rooted
tree in which each node has two children², one of which is called "left" and the other "right".
This naming should be considered not just as a device for drawing the tree, but rather as part 
of the mathematical structure. 

Here is its construction. The nodes of the tree are labelled by positive rational numbers.
For ease of notation, we write each such number in the form $\frac{p}{q}$ with $p, q \in \mathbb{N} \setminus \{0\}$ coprime.
The rule for labelling is recursive: the tree's root is labelled by $\frac{1}{1}$. If a node is labelled $\frac{p}{q}$, then
its left child bears the label $\frac{p}{p+q}$ and its right child bears the label $\frac{p+q}{q}$. By induction we directly
see that these are reduced fractions as written.
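
The labelling rule can be tried out directly. The following is a minimal sketch, assuming the rule used in the proof of Proposition 1.1 (left child p/(p+q), right child (p+q)/q); the function name `calkin_wilf_levels` is ours and purely illustrative.

```python
from fractions import Fraction


def calkin_wilf_levels(depth):
    """Return the first `depth` layers of the tree, each as a list of Fractions."""
    levels = [[Fraction(1, 1)]]                      # layer 0: the root 1/1
    for _ in range(depth - 1):
        next_level = []
        for x in levels[-1]:
            p, q = x.numerator, x.denominator
            next_level.append(Fraction(p, p + q))    # left child  p/(p+q)
            next_level.append(Fraction(p + q, q))    # right child (p+q)/q
        levels.append(next_level)
    return levels


if __name__ == "__main__":
    for layer in calkin_wilf_levels(4):
        print("  ".join(f"{x.numerator}/{x.denominator}" for x in layer))
```

Keeping the children in the order (left, right) within each layer preserves the left/right structure that the text treats as part of the tree.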

Before stating and proving the basic properties of this tree, we encourage the reader to
contemplate Table 1, where the first few layers are shown.



The author acknowledges financial support by the Max-Planck-Institut für Mathematik in Bonn.
¹ It was considered worthy by the authors of [1] to be included in their BOOK.

² By the recursive procedure for constructing the tree, it seems natural to use the family metaphor in this
direction. Since this is the usual terminology, we stick to it. The reverse direction would be somewhat more
fitting, though, since (at least by the current state of the art in reproductive medicine) everybody has precisely two
parents, one of which is "male" and one of which is "female". But to produce children, you need a partner, and
their number is generally not fixed to two. In either direction, an infinite chain appears problematic, although
there can be little doubt that Thomas Aquinas would have preferred an infinite sequence of children.

Table 1. The first five layers of the Calkin-Wilf tree

Proposition 1.1 (Calkin-Wilf). In the Calkin-Wilf tree, every positive rational appears exactly once.

Proof. For ease of parlance, we confuse nodes with their labels. 

Writing a positive rational as p/q with p, q coprime positive integers, we proceed by induction 
on m = max(p, q). For m = 1 there is only p = q = 1 to consider. The rational number 1/1 = 1 
does appear in the tree, namely at its root; it cannot occur anywhere else, since each left child 
p/(p + q) is smaller than 1 and each right child (p + q)/q is bigger than 1. 

Assume now that the statement is proved for all m < m_0, and let x = p/q with max(p, q) =
m_0. Then either x < 1 or x > 1. In the first case, we have m_0 = q > p, hence x is the left child
of the (by assumption) unique node labelled p/(q - p), and since it cannot be a right child (else
x > 1), it cannot occur at any other place. Similarly, if x > 1, it must be a right child, and it
must be the right child of (p - q)/q which, by assumption, does occur exactly once. □

The proof already shows that the position of a positive rational p/q can be determined by 
performing the Euclidean algorithm on p and q. It is also clear that the continued fraction 
expansion of p/q and the sequence of left / right moves one has to make from 1 in order to get 
to p/q are easily translated into one another. 
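
The remark about the Euclidean algorithm can be made concrete. The sketch below uses the subtractive form of the algorithm to recover the sequence of left/right moves from the root to p/q; the name `path_from_root` and the letters "L"/"R" for the moves are illustrative choices, not notation from the sources cited.

```python
from math import gcd


def path_from_root(p, q):
    """Moves ('L' or 'R') leading from the root 1/1 down to the node p/q."""
    if p <= 0 or q <= 0 or gcd(p, q) != 1:
        raise ValueError("p/q must be a reduced positive fraction")
    moves = []
    while (p, q) != (1, 1):
        if p < q:
            moves.append("L")   # p/q is the left child of p/(q-p)
            q -= p
        else:
            moves.append("R")   # p/q is the right child of (p-q)/q
            p -= q
    # The moves were collected bottom-up, so reverse them.
    return "".join(reversed(moves))


if __name__ == "__main__":
    print(path_from_root(3, 5))
```

The lengths of the maximal runs of equal letters are the quotients produced by the Euclidean algorithm, which is the link to continued fractions mentioned above.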

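So, if we write down the first line, then the second line, then the third line of the Calkin-Wilf
tree and so on, we obtain a list of the positive rationals in which each of them appears exactly
once, i.e. a bijection $\mathbb{N}_0 \to \mathbb{Q}_{>0}$. As can be checked from Table 1, this list begins with

\[
\tfrac{1}{1}, \tfrac{1}{2}, \tfrac{2}{1}, \tfrac{1}{3}, \tfrac{3}{2}, \tfrac{2}{3}, \tfrac{3}{1}, \tfrac{1}{4}, \tfrac{4}{3}, \tfrac{3}{5}, \tfrac{5}{2}, \tfrac{2}{5}, \tfrac{5}{3}, \tfrac{3}{4}, \tfrac{4}{1}, \tfrac{1}{5}, \tfrac{5}{4}, \tfrac{4}{7}, \tfrac{7}{3}, \tfrac{3}{8}, \tfrac{8}{5}, \tfrac{5}{7}, \tfrac{7}{2}, \tfrac{2}{7}, \tfrac{7}{5}, \tfrac{5}{8}, \tfrac{8}{3}, \tfrac{3}{7}, \tfrac{7}{4}, \tfrac{4}{5}, \tfrac{5}{1}, \dots
\tag{1}
\]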

The attentive reader will long have noticed that the denominator of each term is equal to the
numerator of its successor. This can easily be proved by induction. Hence there must be a
function $f \colon \mathbb{N}_0 \to \mathbb{N}$ such that $f(n)$ and $f(n+1)$ are coprime, and the $n$-th element of the
sequence is equal to $f(n)/f(n+1)$. It is proved in [6] that $f(n)$ is the number of ways to
partition $n$ into powers of two, each power occurring at most twice.
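
This combinatorial description can be checked on the first few terms. The following sketch counts the partitions by brute-force recursion and compares the quotients with a breadth-first listing of the tree; the helper names `hyperbinary` and `breadth_first` are ours, and the check covers only finitely many terms, so it is an illustration rather than a proof.

```python
from fractions import Fraction
from functools import lru_cache


@lru_cache(maxsize=None)
def hyperbinary(n, power=1):
    """Ways to write n as a sum of powers of two >= `power`, each used at most twice."""
    if n == 0:
        return 1
    if power > n:
        return 0
    # Use the current power 0, 1 or 2 times, then move on to the next power.
    return sum(hyperbinary(n - k * power, 2 * power)
               for k in range(3) if k * power <= n)


def breadth_first(count):
    """First `count` labels of the tree, read layer by layer from left to right."""
    queue = [Fraction(1)]
    i = 0
    while len(queue) < count:
        x = queue[i]
        i += 1
        p, q = x.numerator, x.denominator
        queue += [Fraction(p, p + q), Fraction(p + q, q)]
    return queue[:count]


if __name__ == "__main__":
    print([hyperbinary(n) for n in range(16)])
```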
Moshe Newman has also found a simple recursive construction of the sequence ([1]) that does
not refer to the tree at all: it is the sequence $(a_n)$ with $a_0 = 1$ and

\[
a_{n+1} = \frac{1}{1 + \lfloor a_n \rfloor - \{a_n\}}.
\]

Here, $\lfloor a_n \rfloor$ is the largest integer $\le a_n$ and $\{a_n\} = a_n - \lfloor a_n \rfloor$ is the "fractional part" of $a_n$. This
was a solution to a problem raised by Donald Knuth in the American Mathematical Monthly,
see [9].
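
Newman's recursion is easy to run with exact rational arithmetic. A minimal sketch, with the illustrative names `newman_next` and `newman_sequence`:

```python
from fractions import Fraction
from math import floor


def newman_next(a):
    """One step of the recursion a -> 1 / (1 + floor(a) - frac(a))."""
    integer_part = floor(a)            # largest integer <= a
    fractional_part = a - integer_part
    return 1 / (1 + integer_part - fractional_part)


def newman_sequence(count):
    """First `count` terms, starting from a_0 = 1."""
    terms = []
    a = Fraction(1)
    for _ in range(count):
        terms.append(a)
        a = newman_next(a)
    return terms


if __name__ == "__main__":
    print(", ".join(str(a) for a in newman_sequence(15)))
```

Using `Fraction` rather than floating point keeps the floor and fractional part exact, which matters because the recursion is iterated.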

For more details, and further interesting developments in directions not touched upon in this
paper, see [3], [5], and [11].

We wish to look upon the Calkin-Wilf tree from another point of view: that of Möbius
transformations. Recall that the group of Möbius transformations over a field $K$ is the group
$\mathrm{PGL}_2(K) = \mathrm{GL}_2(K)/K^\times$. We introduce the following notation:

\[
\begin{bmatrix} a & b \\ c & d \end{bmatrix}
\text{ is the element of } \mathrm{PGL}_2(K) \text{ represented by }
\begin{pmatrix} a & b \\ c & d \end{pmatrix}.
\]

These Möbius transformations operate upon $\mathbb{P}^1(K) = K \cup \{\infty\}$ in the well-known way

\[
\begin{bmatrix} a & b \\ c & d \end{bmatrix}(z) = \frac{az + b}{cz + d}.
\]

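The subgroup $\mathrm{PSL}_2(\mathbb{Z}) = \mathrm{SL}_2(\mathbb{Z})/\{\pm 1\}$ of $\mathrm{PGL}_2(\mathbb{Q})$ has been much investigated, and it
operates transitively on $\mathbb{P}^1(\mathbb{Q})$. A closer look at the rules generating the Calkin-Wilf tree shows
that if a node is labelled by $x \in \mathbb{Q}_{>0} \subset \mathbb{P}^1(\mathbb{Q})$, then its left child is labelled by $L(x)$ and its
right child by $R(x)$, where

\[
L = \begin{bmatrix} 1 & 0 \\ 1 & 1 \end{bmatrix}
\quad\text{and}\quad
R = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}.
\tag{2}
\]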

These choices may at first glance look arbitrary, but we shall argue in the next section that 
they are not. 
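
As a quick sanity check of this description, here is a small sketch of the Möbius action on finite rational arguments, assuming L is represented by the matrix with rows (1, 0), (1, 1) and R by the matrix with rows (1, 1), (0, 1). The function `mobius` is an illustrative helper and deliberately ignores the point at infinity.

```python
from fractions import Fraction

# Representing matrices, stored as tuples of rows.
L = ((1, 0), (1, 1))
R = ((1, 1), (0, 1))


def mobius(matrix, x):
    """Apply z -> (a z + b) / (c z + d) to a finite argument x with c x + d != 0."""
    (a, b), (c, d) = matrix
    return (a * x + b) / (c * x + d)


if __name__ == "__main__":
    x = Fraction(3, 5)
    print(mobius(L, x), mobius(R, x))   # left and right child of 3/5
```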



2. The Monoid $\mathrm{SL}_2(\mathbb{N}_0)$

Most of the literature on Möbius transformations deals with groups of them, but here we shall
be concerned with monoids. Since this term is somewhat ambiguous, let us fix a definition:

Definition 2.1. A monoid is a set $M$ together with a binary operation $\cdot \colon M \times M \to M$ with
the following properties:

(i) it is associative, i.e. $x(yz) = (xy)z$ for any $x, y, z \in M$, and

(ii) there exists an identity element, i.e. an element $e \in M$ such that $ex = x = xe$ for all
$x \in M$.

Such an identity element is necessarily unique.

As usual in algebra, one can now introduce free monoids. If $A$ is a set (considered as an
"alphabet"), then the free monoid³ $\mathcal{F}(A)$ generated by $A$ consists of all formal words of finite
length in the alphabet $A$. Multiplication is given by concatenation. The empty word is
allowed and serves as the identity element in $\mathcal{F}(A)$.

If $M$ is a monoid and $A \subseteq M$ a subset, we say that $M$ is free on $A$ or freely generated by $A$
if the obvious map $\mathcal{F}(A) \to M$ is an isomorphism of monoids; in other words, if each element
of $M$ can be written in a unique way as a product of elements of $A$.

What do free monoids look like? Certainly the free monoid on one element is isomorphic to
$\mathbb{N}_0$ with addition. The free monoid on two generators is much richer in structure. It is tempting
to think of it as similar to the free group on two generators; but it is in fact much more rigid.
Namely:

Lemma 2.2. Let $X = \{x_1, \dots, x_n\}$ be a finite set with $n$ elements, and set $\mathcal{F}_n = \mathcal{F}(X)$.
Then any automorphism of $\mathcal{F}_n$ is obtained from a permutation of the $x_i$.

Proof. Consider $X$ as a subset of $\mathcal{F}_n$. Then an element $\gamma \in \mathcal{F}_n$ is in $X$ if and only if $\gamma \neq 1$
and whenever $\gamma = \delta\varepsilon$, then at least one of $\delta$, $\varepsilon$ is equal to 1. Hence any automorphism of $\mathcal{F}_n$
takes $X$ to itself.

In particular, $\mathcal{F}_m \simeq \mathcal{F}_n$ if and only if $m = n$. □

An automorphism of $\mathcal{F}_n$ is of course determined by what it does on $X$, and so we get an
isomorphism $\operatorname{Aut} \mathcal{F}_n \simeq \mathfrak{S}_n$, the symmetric group. By contrast, if $F_n$ denotes the free group
on $n$ letters, the automorphism group $\operatorname{Aut} F_n$ is huge. But the picture becomes clearer when one
notices that the analogue of $\operatorname{Aut} F_n$ should not be the group $\operatorname{Aut} \mathcal{F}_n$, but the monoid $\operatorname{End} \mathcal{F}_n$,
which is much larger.

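But now enough abstract algebra; we finally introduce the object announced in the section
title. As one would expect from the notation, the monoid $\mathrm{SL}_2(\mathbb{N}_0)$ consists of all $(2 \times 2)$-matrices
with entries in $\mathbb{N}_0$ having determinant one, with matrix multiplication as the monoid operation.
In other words, $\mathrm{SL}_2(\mathbb{N}_0)$ is the submonoid of $\mathrm{SL}_2(\mathbb{Z})$ consisting of all matrices with nonnegative
entries. Note that the composition

\[
\mathrm{SL}_2(\mathbb{N}_0) \hookrightarrow \mathrm{SL}_2(\mathbb{Z}) \to \mathrm{PSL}_2(\mathbb{Z})
\tag{3}
\]

is injective, so that we can and will view $\mathrm{SL}_2(\mathbb{N}_0)$ as a submonoid of $\mathrm{PSL}_2(\mathbb{Z})$. Hence the
Möbius transformations $L$ and $R$ introduced above can be viewed as elements of $\mathrm{SL}_2(\mathbb{N}_0)$.

Proposition 2.3 (Folklore). The monoid $\mathrm{SL}_2(\mathbb{N}_0)$ is freely generated by the elements

\[
L = \begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix}
\quad\text{and}\quad
R = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}.
\tag{4}
\]

Proof. We first show that $\mathrm{SL}_2(\mathbb{N}_0)$ is generated by $L$ and $R$. So let

\[
\gamma = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \mathrm{SL}_2(\mathbb{N}_0).
\]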




³ Friends of abstract nonsense will immediately recognize that this is equivalent to the definition in terms of
an adjoint functor to the forgetful functor to sets that they surely would have proposed.

We set $\ell(\gamma) = a + b + c + d$ and proceed by induction on $\ell(\gamma)$. It is clear that $\ell(\gamma) \ge 2$, with
equality if and only if $\gamma = 1$. Hence we may assume that $\ell(\gamma) \ge 3$ and $\gamma \neq 1$. Consider the
two products in $\mathrm{SL}_2(\mathbb{Z})$:

\[
L^{-1}\gamma = \begin{pmatrix} a & b \\ c - a & d - b \end{pmatrix}
\quad\text{and}\quad
R^{-1}\gamma = \begin{pmatrix} a - c & b - d \\ c & d \end{pmatrix}.
\]

By Lemma 2.4 below, $(a - c)(b - d) \ge 0$, so at least one of these is in $\mathrm{SL}_2(\mathbb{N}_0)$. For the sake
of simplicity, assume that $L^{-1}\gamma \in \mathrm{SL}_2(\mathbb{N}_0)$; the other case is treated analogously. Then
$\ell(L^{-1}\gamma) < \ell(\gamma)$, so by the induction hypothesis $L^{-1}\gamma$ is a product of $L$ and $R$. Hence so is $\gamma$.

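Now we have proved that $L$ and $R$ generate $\mathrm{SL}_2(\mathbb{N}_0)$. As to freedom, we show that $\mathrm{SL}_2(\mathbb{N}_0)$
is the disjoint union of the sets $\{1\}$, $L \cdot \mathrm{SL}_2(\mathbb{N}_0)$ and $R \cdot \mathrm{SL}_2(\mathbb{N}_0)$. That it is their union follows
from the fact already proved (that $L$ and $R$ generate $\mathrm{SL}_2(\mathbb{N}_0)$), and the disjointness follows by
contemplating the equations

\[
L \cdot \begin{pmatrix} a & b \\ c & d \end{pmatrix} = \begin{pmatrix} a & b \\ a + c & b + d \end{pmatrix}
\quad\text{and}\quad
R \cdot \begin{pmatrix} a & b \\ c & d \end{pmatrix} = \begin{pmatrix} a + c & b + d \\ c & d \end{pmatrix}.
\]

(Just consider the possible order relations between entries.) But this observation gives an
induction proof on word length for the uniqueness of a word defining an element. □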
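
The proof is constructive, and the descent it describes can be sketched in a few lines. The code below assumes L and R are the matrices with rows (1, 0), (1, 1) and (1, 1), (0, 1) respectively; `factor`, `multiply` and `word_to_matrix` are illustrative names. At each step the comparison of the two rows decides whether the matrix lies in R · SL_2(N_0) or in L · SL_2(N_0).

```python
L = ((1, 0), (1, 1))
R = ((1, 1), (0, 1))
IDENTITY = ((1, 0), (0, 1))


def multiply(m, n):
    """Product of two 2x2 matrices given as tuples of rows."""
    (a, b), (c, d) = m
    (e, f), (g, h) = n
    return ((a * e + b * g, a * f + b * h), (c * e + d * g, c * f + d * h))


def factor(gamma):
    """Write gamma in SL_2(N_0) as a word in L and R, leftmost letter first."""
    (a, b), (c, d) = gamma
    if min(a, b, c, d) < 0 or a * d - b * c != 1:
        raise ValueError("expected a matrix in SL_2(N_0)")
    word = []
    while ((a, b), (c, d)) != IDENTITY:
        if a >= c and b >= d:
            word.append("R")            # gamma = R * (gamma with row 2 subtracted from row 1)
            a, b = a - c, b - d
        else:
            word.append("L")            # gamma = L * (gamma with row 1 subtracted from row 2)
            c, d = c - a, d - b
    return "".join(word)


def word_to_matrix(word):
    """Multiply out a word in the letters 'L' and 'R'."""
    result = IDENTITY
    for letter in word:
        result = multiply(result, L if letter == "L" else R)
    return result


if __name__ == "__main__":
    print(factor(((3, 5), (1, 2))))
```

Since the entry sum drops at every step, the loop terminates, and the returned word is the unique one whose product is the given matrix.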

We should remark that $L$ and $R$ do not generate a free group of matrices, nor of Möbius
transformations. To be more specific, the subgroup of $\mathrm{GL}_2(\mathbb{Q})$ they generate is $\mathrm{SL}_2(\mathbb{Z})$, and
correspondingly the subgroup of $\mathrm{PGL}_2(\mathbb{Q})$ they generate is $\mathrm{PSL}_2(\mathbb{Z})$. Both groups are well
known to contain nontrivial torsion elements. For instance, we have the equations $(RL^{-1}R)^2 = 1$
in $\mathrm{PSL}_2(\mathbb{Z})$ and $(RL^{-1}R)^4 = 1$ in $\mathrm{SL}_2(\mathbb{Z})$.

Lemma 2.4. Let

\[
\begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \mathrm{SL}_2(\mathbb{N}_0)
\]

be different from the identity matrix. Then $(a - c)(b - d) \ge 0$.

Proof. Assume that $(a - c)(b - d) < 0$, i.e. that $a - c$ and $b - d$ are both nonzero and have
opposite signs. There are two cases.

The first case is that $a > c$ and $d > b$. Then $a \ge c + 1$ and $d \ge b + 1$, whence

\[
1 = ad - bc \ge (c + 1)(b + 1) - bc = b + c + 1 \ge 1,
\]

so equality has to hold everywhere, and $b = c = 0$. From $ad - bc = 1$ we get that $a = d = 1$,
hence the matrix in question is the identity matrix.

The second case is that $c > a$ and $b > d$. Then $c \ge a + 1$ and $b \ge d + 1$, so that

\[
-1 = bc - ad \ge (a + 1)(d + 1) - ad = a + d + 1 \ge 1,
\]

contradiction. □

We can now reinterpret the Calkin-Wilf tree in a new light: it is the directed Cayley graph
of $\mathrm{SL}_2(\mathbb{N}_0)$. Let us make this precise.

Definition 2.5. A directed graph is a quadruple $(V, E, s, t)$, where $V$ and $E$ are sets (of
"vertices" and "edges", respectively) and $s$ and $t$ are maps $E \to V$ (designating "source" and
"target").

When we draw (or imagine) a directed graph, we draw a node for each $v \in V$, and for each
$e \in E$ an arrow originating in $s(e)$ and ending in $t(e)$. Forgetting the orientations of the arrows
gives a graph in the usual sense, and we say that a directed graph is a (directed) tree if this
underlying undirected graph is a tree.

Definition 2.6. Let $M$ be a monoid and $A \subseteq M$ a generating set. The directed Cayley graph
$C(M, A)$ is the directed graph $(V, E, s, t)$ with $V = M$ and $E = M \times A$, such that $s(\mu, a) = \mu$
and $t(\mu, a) = a\mu$.

In less formal terms, the vertices are in bijection with $M$, and for each $\mu \in M$ and each
$a \in A$ we draw an arrow from $\mu$ to $a\mu$.

Note that if $A$ freely generates $M$, then $C(M, A)$ is a directed tree where every arrow points
away from the "root" $e \in M$.

When treating Cayley graphs of groups, there is often a nasty ambiguity involved in choosing 
a set of generators. As a consequence, one is mainly interested in properties of the Cayley graph 
that do not depend on the choice of a particular set of generators. Here, however, we are in 
a much nicer situation. Proposition 2.3 gives us an explicit isomorphism between $\mathcal{F}_2$ and
$\mathrm{SL}_2(\mathbb{N}_0)$, and from Lemma 2.2 we learn that $\{L, R\}$ is the only subset that freely generates
$\mathrm{SL}_2(\mathbb{N}_0)$. In other words, if we want a tree, we have no other choice for our generators.

Proposition 2.7. Consider $\mathrm{SL}_2(\mathbb{N}_0)$ as a submonoid of the group $\mathrm{PSL}_2(\mathbb{Z})$, acting on $\mathbb{P}^1(\mathbb{Q})$
by Möbius transformations. The orbit map $\gamma \mapsto \gamma(1)$ defines a bijection $\Omega \colon \mathrm{SL}_2(\mathbb{N}_0) \to \mathbb{Q}_{>0}$.

Furthermore, $\Omega$ defines an isomorphism of directed graphs between the directed Cayley tree
$C(\mathrm{SL}_2(\mathbb{N}_0), \{L, R\})$ and the Calkin-Wilf tree. Here we identify the vertex set of the Calkin-Wilf
tree with $\mathbb{Q}_{>0}$, and we orient each of its edges as pointing away from 1. □

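This has an amusingly simple consequence in terms of Diophantine equations:

Corollary 2.8. Let $p, q$ be coprime positive integers. Then there exist unique $a, b, c, d \in \mathbb{N}_0$
with $a + b = p$, $c + d = q$ and $ad - bc = 1$.

Proof. Set $x = p/q$. For a matrix $\gamma$ with entries $a, b, c, d$ we have $\gamma(1) = (a + b)/(c + d)$, and this
fraction is reduced because $d(a + b) - b(c + d) = ad - bc = 1$. Hence the system of equations given
above can be translated into $\gamma(1) = x$ for $\gamma \in \mathrm{SL}_2(\mathbb{N}_0)$, and the claim follows from Proposition 2.7. □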
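
The corollary can be made explicit by walking from p/q up to the root and multiplying the corresponding generators back together. A minimal sketch, assuming the generators L (rows (1, 0), (1, 1)) and R (rows (1, 1), (0, 1)) of Proposition 2.3; the name `unimodular_split` is illustrative.

```python
from math import gcd


def unimodular_split(p, q):
    """Return (a, b, c, d) in N_0 with a + b = p, c + d = q and ad - bc = 1."""
    if p <= 0 or q <= 0 or gcd(p, q) != 1:
        raise ValueError("p and q must be coprime positive integers")
    # Walk up the tree; the first move recorded is the outermost generator.
    moves = []
    while (p, q) != (1, 1):
        if p < q:
            moves.append("L")
            q -= p
        else:
            moves.append("R")
            p -= q
    # Rebuild the matrix by left multiplication, innermost generator first.
    a, b, c, d = 1, 0, 0, 1
    for move in reversed(moves):
        if move == "L":
            c, d = a + c, b + d     # L * gamma adds row 1 to row 2
        else:
            a, b = a + c, b + d     # R * gamma adds row 2 to row 1
    return a, b, c, d


if __name__ == "__main__":
    print(unimodular_split(3, 5))
```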

3. Injective Families 

We are looking for generalisations of the Calkin-Wilf tree; we first generalise the original
construction in four different respects and then ask ourselves if we get any new examples with
comparably nice properties.

(i) Replace 2 by any positive integer $n$: consider directed trees in which every node has $n$
(ordered) children.

(ii) Replace $\mathbb{Q}$ by any number field $K$.

(iii) Replace the initial value $1 \in \mathbb{P}^1(\mathbb{Q})$ by any $x_0 \in \mathbb{P}^1(K)$.

(iv) Replace the two Möbius transformations $L$ and $R$ by $n$ rational maps $f_1, \dots, f_n \in K(t)$.

These data (i) to (iv) should fit together in the following way: we label the tree in (i) in
such a way that the root is labelled $x_0$, and that if a node is labelled by $x \in \mathbb{P}^1(K)$, then its
$n$ children are labelled $f_1(x), \dots, f_n(x)$, in this order. Then every element $x \in \mathbb{P}^1(K)$ should
appear at most once in the tree, and the set of those that do occur should be some "simple"
subset of $\mathbb{P}^1(K)$ (in the Calkin-Wilf tree, it would be $\mathbb{Q}_{>0}$ which is arguably quite simple). Of
course, what we mean by "simple" has to become clear in the course of the discussion.

Let us first consider the tree. The description can be made more conceptual by saying that it
should be the Cayley tree $C(\mathcal{F}(X), X)$, where $X = \{x_1, \dots, x_n\}$ with the $x_i$ pairwise distinct.
As above, we set $\mathcal{F}_n = \mathcal{F}(X)$, and in addition write $C(\mathcal{F}_n)$ as short for $C(\mathcal{F}(X), X)$.

Our rational maps should, of course, be nonconstant; hence they should live in the monoid
$\mathcal{M}(K)$ which consists of all nonconstant rational maps $f \in K(t)$, with composition $f \circ g$
as multiplication. This may be viewed as a submonoid of the monoid of endomorphisms
$\operatorname{End} \mathbb{P}^1_K = \operatorname{Hom}_K(\mathbb{P}^1_K, \mathbb{P}^1_K)$. Here $\mathbb{P}^1_K$ is considered as a $K$-variety. The invertible elements in
this monoid are precisely the Möbius transformations, so that we get a canonical identification
$\mathcal{M}(K)^\times = \mathrm{PGL}_2(K)$. Note that, since $K$ is infinite, we need not distinguish between a rational
function as a formal expression and the map $\mathbb{P}^1(K) \to \mathbb{P}^1(K)$ it induces.

Proposition 3.1. For every number field $K$, the monoid $\mathcal{M}(K)$ is infinitely generated.

Proof. First we show that certain groups are not finitely generated. To begin with, an abelian
2-torsion group is the same as an $\mathbb{F}_2$-vector space; hence such an abelian group is finitely
generated if and only if it is finite. For any number field, the group $K^\times/(K^\times)^2$ is infinite.⁴
Hence it is infinitely generated, and therefore also the group $\mathrm{PGL}_2(K)$, which surjects onto it
via the determinant, must be infinitely generated.

But from this it follows that $\mathcal{M}(K)$ cannot be finitely generated. Suppose it were, say
generated by $f_1, \dots, f_r, g_1, \dots, g_s$ with $\deg f_i = 1$ and $\deg g_j > 1$. Since $\deg(\varphi \circ \psi) = \deg\varphi \cdot
\deg\psi$, we see that any composition containing at least one $g_j$ must have degree $> 1$. So the
monoid (and hence also the group) $\mathrm{PGL}_2(K)$ must be generated by $f_1, \dots, f_r$, which we have
just seen to be impossible. □

It is all the more astonishing that we can express all $f \in \mathcal{M}(K)$ as compositions of just two
admittedly strange maps $\mathbb{P}^1(K) \to \mathbb{P}^1(K)$.

Theorem 3.2 (Sierpiński). Let $A$ be an infinite set, and let $\mathcal{T}(A)$ be the monoid of all maps
$A \to A$, with composition of maps as monoid composition. Let $X \subseteq \mathcal{T}(A)$ be any countable
subset. Then there exist elements $\varphi, \psi \in \mathcal{T}(A)$ such that $X$ is contained in the submonoid of
$\mathcal{T}(A)$ generated by $\varphi$ and $\psi$. □

This theorem was first proved in [14]; shortly afterwards, Banach gave a very elegant proof,
see [3].

Corollary 3.3. For any countable field $K$, there exist two maps $\varphi, \psi$ from $\mathbb{P}^1(K) = K \cup \{\infty\}$
to itself such that every nonconstant rational map $\mathbb{P}^1(K) \to \mathbb{P}^1(K)$ can be written as a finite
composition involving only $\varphi$ and $\psi$.

Proof. Apply Theorem 3.2 to $A = \mathbb{P}^1(K)$ and $X = \mathcal{M}(K)$. □



⁴ This can be seen, for instance, as follows: By Dirichlet's density theorem, see [10, Chapter VII, Theorem
13.2], there are infinitely many prime ideals in the ring of integers $\mathcal{O}_K$ which are principal ideals. Let these be
$\mathfrak{p}_1, \mathfrak{p}_2$ etc., and let $p_k$ be a generator of $\mathfrak{p}_k$. Then the elements $p_1, p_2$ etc. are all distinct modulo $(K^\times)^2$.

So having chosen rational maps $f_1, \dots, f_n$, we consider the unique morphism of monoids
$h \colon \mathcal{F}_n \to \mathcal{M}(K)$ with $h(x_i) = f_i$; then our tree is the Cayley tree $C(\mathcal{F}_n)$, where the node
corresponding to $\gamma \in \mathcal{F}_n$ is labelled by $h(\gamma)(x_0)$. This defines an "evaluation" map

\[
\Omega \colon \mathcal{F}_n \to \mathbb{P}^1(K), \qquad \gamma \mapsto h(\gamma)(x_0).
\tag{5}
\]

Definition 3.4. Let $K$ be a number field, let $x_0 \in \mathbb{P}^1(K)$ and let $f_1, \dots, f_n \in \mathcal{M}(K)$. The
family $(f_1, \dots, f_n) \in \mathcal{M}(K)^n$ is called injective at $x_0$ if the map $\Omega$ as in (5) is injective.

Clearly, a family $(f_1, \dots, f_n) \in \mathcal{M}(K)^n$ is injective at $x_0$ if and only if the $f_i$ generate a
free submonoid $\Gamma \subseteq \mathcal{M}(K)$ and the orbit map $\Gamma \to \mathbb{P}^1(K)$ sending $\gamma$ to $\gamma(x_0)$ is injective. By
conjugating with a suitable Möbius transformation, we can always assume that $x_0 = 1$.
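
Injectivity of a family is an infinite condition, but a finite truncation of it is easy to test and is useful for ruling candidates out. The sketch below checks that all words up to a given length produce distinct labels; `is_injective_up_to` is an illustrative helper, the maps are passed as ordinary Python callables on `Fraction` values, and poles of the maps are not handled.

```python
from fractions import Fraction


def is_injective_up_to(maps, x0, max_length):
    """True if all words of length <= max_length in `maps` give distinct labels at x0."""
    seen = {x0}
    frontier = [x0]
    for _ in range(max_length):
        next_frontier = []
        for x in frontier:
            for f in maps:          # children f_1(x), ..., f_n(x), in this order
                y = f(x)
                if y in seen:
                    return False    # two different words give the same label
                seen.add(y)
                next_frontier.append(y)
        frontier = next_frontier
    return True


if __name__ == "__main__":
    calkin_wilf = [lambda x: x / (x + 1), lambda x: x + 1]
    print(is_injective_up_to(calkin_wilf, Fraction(1), 8))
```

A result of `True` only says that no collision occurs up to the chosen word length; it does not prove injectivity.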

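Some interesting injective families over $\mathbb{Q}$, all of whose members are Möbius transformations,
have been found by S. H. Chan, see [7]. These give forests with a finite number of
components rather than isolated trees. For the reader's convenience, we describe them in our
terms.

For every integer $k \ge 2$, a family of $2k$ Möbius transformations is defined as follows:

\[
\begin{bmatrix} 1 & 0 \\ 2 & 1 \end{bmatrix},
\begin{bmatrix} 2 & 1 \\ 3 & 2 \end{bmatrix},
\dots,
\begin{bmatrix} k-1 & k-2 \\ k & k-1 \end{bmatrix},
\begin{bmatrix} k & k-1 \\ k & k \end{bmatrix},
\begin{bmatrix} k & k \\ k-1 & k \end{bmatrix},
\begin{bmatrix} k-1 & k \\ k-2 & k-1 \end{bmatrix},
\dots,
\begin{bmatrix} 2 & 3 \\ 1 & 2 \end{bmatrix},
\begin{bmatrix} 1 & 2 \\ 0 & 1 \end{bmatrix}.
\]

It is injective on each of the initial values $x_1, \dots, x_{2k-1}$ given by

\[
\frac{1}{2}, \frac{2}{3}, \dots, \frac{k-1}{k}, \frac{k}{k}, \frac{k}{k-1}, \dots, \frac{3}{2}, \frac{2}{1}.
\]

Furthermore, the orbits $\Gamma(x_1), \dots, \Gamma(x_{2k-1})$ are disjoint and their union is $\mathbb{Q}_{>0}$. All this is
proved in [7, Theorem 4].

There is a similar infinite family of injective families; they enumerate the slightly more
complicated set $\mathbb{Q}^{\mathrm{even}}_{>0}$ of all positive rational numbers $\frac{p}{q}$ with $p, q$ coprime and $pq$ even. For
every integer $k \ge 1$, consider the family of $2k + 1$ Möbius transformations:

\[
\begin{bmatrix} 1 & 0 \\ 2 & 1 \end{bmatrix},
\begin{bmatrix} 2 & 1 \\ 3 & 2 \end{bmatrix},
\dots,
\begin{bmatrix} k & k-1 \\ k+1 & k \end{bmatrix},
\begin{bmatrix} k+1 & k \\ k & k+1 \end{bmatrix},
\begin{bmatrix} k & k+1 \\ k-1 & k \end{bmatrix},
\dots,
\begin{bmatrix} 2 & 3 \\ 1 & 2 \end{bmatrix},
\begin{bmatrix} 1 & 2 \\ 0 & 1 \end{bmatrix}.
\]

It is injective on each of the initial values $y_1, \dots, y_{2k}$ given as

\[
\frac{1}{2}, \frac{2}{3}, \dots, \frac{k}{k+1}, \frac{k+1}{k}, \dots, \frac{3}{2}, \frac{2}{1}.
\]

The orbits $\Gamma(y_1), \dots, \Gamma(y_{2k})$ are disjoint and their union is $\mathbb{Q}^{\mathrm{even}}_{>0}$. This can be found in [7,
Theorems 2 and 5]. Theorem 2 in op. cit. is followed by a detailed discussion of the simplest
case $k = 1$.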
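
To get a feeling for the simplest case, the forest can be generated explicitly. The sketch below rests on an assumption of ours that should be checked against [7]: that for k = 1 the three transformations are x -> x/(2x+1), x -> (2x+1)/(x+2) and x -> x+2, with initial values 1/2 and 2. Under this assumption the three images of the positive rationals are the disjoint intervals (0, 1/2), (1/2, 2) and (2, ∞), which is why no label can repeat.

```python
from fractions import Fraction

# Assumed k = 1 family (our reading; not verified against the cited source).
MAPS = [
    lambda x: x / (2 * x + 1),        # image of (0, oo) is (0, 1/2)
    lambda x: (2 * x + 1) / (x + 2),  # image of (0, oo) is (1/2, 2)
    lambda x: x + 2,                  # image of (0, oo) is (2, oo)
]
ROOTS = [Fraction(1, 2), Fraction(2, 1)]


def forest_labels(depth):
    """All labels of the two trees down to the given depth (roots have depth 0)."""
    labels = list(ROOTS)
    frontier = list(ROOTS)
    for _ in range(depth):
        frontier = [f(x) for x in frontier for f in MAPS]
        labels.extend(frontier)
    return labels


if __name__ == "__main__":
    print(sorted(forest_labels(2)))
```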

Similar to the interpretation of the denominators and numerators of the Calkin-Wilf sequence
as a combinatorial function, there are further combinatorial interpretations of these forests in

4. Heights on $\mathbb{P}^1$ and the Distribution of Points



Let $K$ be a number field. A place of $K$ is an equivalence class of valuations; denote the set of
all places of $K$ by $\mathcal{P}(K)$. If $\mathfrak{p}$ is a place of $K$, write $K_{\mathfrak{p}}$ for the corresponding completion. For
every place $\mathfrak{p}$ we choose a representing valuation $|\cdot|_{\mathfrak{p}} \colon K \to [0, \infty)$ in the following way:

(i) If $\mathfrak{p}$ is real, there is a unique isomorphism of fields $K_{\mathfrak{p}} \simeq \mathbb{R}$, and we pull back along this
isomorphism the usual absolute value $|x| = \max(x, -x)$ on the reals.

(ii) If $\mathfrak{p}$ is complex, there are two isomorphisms $\tau, \bar\tau \colon K_{\mathfrak{p}} \simeq \mathbb{C}$ of topological fields, and we
set $|x|_{\mathfrak{p}} = \tau(x)\bar\tau(x)$.

(iii) If $\mathfrak{p}$ is non-archimedean, let $q$ be the cardinality of the corresponding residue class field.
Let $\pi \in K$ be a uniformising element; we normalise $|\cdot|_{\mathfrak{p}}$ in such a way that $|\pi|_{\mathfrak{p}} = \frac{1}{q}$.

With these normalisations, we have the famous product formula, see [10, Chapter III,
Proposition 1.3]: for any $x \in K^\times$, all but a finite number of the $|x|_{\mathfrak{p}}$ are equal to 1, and

\[
\prod_{\mathfrak{p} \in \mathcal{P}(K)} |x|_{\mathfrak{p}} = 1.
\tag{6}
\]

As a consequence, the following construction gives a well-defined function on $\mathbb{P}^n(K)$ which can
be thought of as measuring the arithmetic complexity of a point.

Definition 4.1. Let $K$ be a number field of degree $d$ and let $x \in \mathbb{P}^n(K)$. Choose $x_0, \dots, x_n \in K$
such that $x = (x_0 : \cdots : x_n)$; the (absolute) height of $x$ is the real number

\[
H(x) = \Bigl( \prod_{\mathfrak{p} \in \mathcal{P}(K)} \max\bigl(|x_0|_{\mathfrak{p}}, \dots, |x_n|_{\mathfrak{p}}\bigr) \Bigr)^{1/d}.
\tag{7}
\]

The (absolute) logarithmic height of $x$ is the real number

\[
h(x) = \log H(x).
\tag{8}
\]

We always have $H(x) \ge 1$ and therefore $h(x) \ge 0$, with equality if and only if $x$ is a root of
unity, see [15, Theorem 3.8]. The absolute height is defined in such a way that the functions
$H \colon \mathbb{P}^n(K) \to [1, \infty)$ for varying $K$ glue together to $H \colon \mathbb{P}^n(\overline{\mathbb{Q}}) \to [1, \infty)$, similarly for $h$.

For $K = \mathbb{Q}$, there is a description of the height which is much more intuitive and makes
computations much easier: if $x \in \mathbb{P}^n(\mathbb{Q})$, we can write it as $x = (x_0 : \cdots : x_n)$ with $x_0, \dots, x_n \in
\mathbb{Z}$ coprime. Then

\[
H(x) = \max\bigl(|x_0|_\infty, \dots, |x_n|_\infty\bigr).
\tag{9}
\]

Here, of course, $|\cdot|_\infty$ is the usual absolute value on $\mathbb{Z} \subset \mathbb{R}$, i.e. $|a|_\infty = \max(a, -a)$.

We now examine how $H(f(x))$ relates to $H(x)$, where $f$ is a rational function. First we
consider the case of Möbius transformations. By identifying the matrix entries with coordinates,
we can view $\mathrm{GL}_2(K)$ as a subset of $K^4$. This is compatible with the action of $K^\times$, on the
matrix group by multiplication with scalar matrices, and on the linear space by multiplication
with scalars. So we can view $\mathrm{PGL}_2(K) = \mathrm{GL}_2(K)/K^\times$ as a subset of $\mathbb{P}^3(K)$ and define the
height of an element of $\mathrm{PGL}_2(K)$ as the height of the corresponding point in $\mathbb{P}^3(K)$. By the
simple description of heights for $K = \mathbb{Q}$, we get an equally simple description of the height of
an element $\gamma \in \mathrm{PGL}_2(\mathbb{Q})$: represent $\gamma$ by a matrix

\[
\begin{pmatrix} a & b \\ c & d \end{pmatrix}
\]

with $a, b, c, d \in \mathbb{Z}$ having greatest common divisor 1. Then

\[
H(\gamma) = H((a : b : c : d)) = \max\bigl(|a|_\infty, |b|_\infty, |c|_\infty, |d|_\infty\bigr).
\]
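
For K = Q the height is a one-line computation. A minimal sketch, assuming the point is given by integer homogeneous coordinates (not necessarily coprime, not all zero); `height` is an illustrative name, and a Möbius transformation is handled by passing its four matrix entries.

```python
from functools import reduce
from math import gcd


def height(coords):
    """H of the point (x_0 : ... : x_n) in P^n(Q) given by integer coordinates."""
    g = reduce(gcd, coords)           # math.gcd is non-negative, also for negative input
    if g == 0:
        raise ValueError("all coordinates are zero")
    # Dividing by the gcd makes the coordinates coprime; then take the largest one.
    return max(abs(c) for c in coords) // g


if __name__ == "__main__":
    print(height((6, -4)))        # the point 6/(-4) = -3/2 of P^1(Q)
    print(height((2, 3, 1, 2)))   # the Moebius transformation with rows (2, 3), (1, 2)
```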

Lemma 4.2. Let $K$ be a number field of degree $d$ and $\gamma \in \mathrm{PGL}_2(K)$. Then $H(\gamma) = H(\gamma^{-1})$.

Proof. If $\gamma \in \mathrm{PGL}_2(K)$ is represented by the matrix $A$, then $\gamma^{-1}$ is represented by the matrix
$A^{-1} = (\det A)^{-1} A'$, where the matrix $A'$ is obtained by permuting the entries of $A$ in a well-known
fashion and multiplying two of them by $-1$. But by the definition of $\mathrm{PGL}_2$, we see
that $\gamma^{-1}$ is also represented by $A'$, whence $H(\gamma) = H(\gamma^{-1})$. □

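Proposition 4.3. Let $K$ be a number field of degree $d$, let $x \in \mathbb{P}^1(K)$ and $\gamma \in \mathrm{PGL}_2(K)$.
Then

\[
\frac{1}{2H(\gamma)} H(x) \le H(\gamma(x)) \le 2H(\gamma)H(x).
\]

Proof. We only need to show the second inequality; the first will follow by replacing $\gamma$ by $\gamma^{-1}$
and $x$ by $\gamma(x)$, and using Lemma 4.2. So choose a representative matrix

\[
\begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \mathrm{GL}_2(K)
\]

for $\gamma$. Write $x = (x_0 : x_1)$. Then for any place $\mathfrak{p}$ of $K$ we get

\[
\max\bigl(|ax_0 + bx_1|_{\mathfrak{p}}, |cx_0 + dx_1|_{\mathfrak{p}}\bigr) \le t_{\mathfrak{p}} \cdot \max\bigl(|a|_{\mathfrak{p}}, |b|_{\mathfrak{p}}, |c|_{\mathfrak{p}}, |d|_{\mathfrak{p}}\bigr) \cdot \max\bigl(|x_0|_{\mathfrak{p}}, |x_1|_{\mathfrak{p}}\bigr)
\]

by the triangle inequality; here $t_{\mathfrak{p}}$ is 1 if $\mathfrak{p}$ is non-archimedean, 2 if $\mathfrak{p}$ is real and 4 if $\mathfrak{p}$ is complex.
If $K$ has $r_1$ real and $r_2$ complex places, the product of all $t_{\mathfrak{p}}$ is $2^{r_1} 4^{r_2} = 2^d$.
Taking the product over all $\mathfrak{p}$ and then taking $d$-th roots yields the desired result. □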
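
Over Q both inequalities can be confirmed by exhaustive search on small inputs. The sketch below does this for all integer matrices and points with bounded entries; it is a numerical illustration of the statement for K = Q only, and `bounds_hold` and the chosen bounds are our own.

```python
from functools import reduce
from itertools import product
from math import gcd


def height(coords):
    """Height of a point of P^n(Q) given by integer coordinates, not all zero."""
    return max(abs(c) for c in coords) // reduce(gcd, coords)


def bounds_hold(entry_bound, coord_bound):
    """Check H(x)/(2H(g)) <= H(g(x)) <= 2H(g)H(x) for all small g and x over Q."""
    entries = range(-entry_bound, entry_bound + 1)
    coords = range(-coord_bound, coord_bound + 1)
    for a, b, c, d in product(entries, repeat=4):
        if a * d - b * c == 0:
            continue                      # not invertible
        h_gamma = height((a, b, c, d))
        for x0, x1 in product(coords, repeat=2):
            if (x0, x1) == (0, 0):
                continue                  # not a point of P^1
            h_x = height((x0, x1))
            h_y = height((a * x0 + b * x1, c * x0 + d * x1))
            if not (h_x <= 2 * h_gamma * h_y and h_y <= 2 * h_gamma * h_x):
                return False
    return True


if __name__ == "__main__":
    print(bounds_hold(2, 3))
```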

Thus Möbius transformations can only change the height by a bounded multiplicative factor. With
some more effort, one obtains the following special case of [15, Theorem 3.11]:

Theorem 4.4. Let $K$ be a number field and $f \in \mathcal{M}(K)$ a rational map of degree $d$. Then
there exist constants $c_1, c_2 > 0$ such that for all $x \in \mathbb{P}^1(K)$,

\[
c_1 \cdot H(x)^d \le H(f(x)) \le c_2 \cdot H(x)^d.
\]

We now turn to estimating points in a fixed field of bounded height. 

Theorem 4.5. We have the following asymptotics as $N \to \infty$:

\[
\operatorname{card}\{x \in \mathbb{P}^1(\mathbb{Q}) \mid H(x) \le N\} = \frac{12}{\pi^2} N^2 + O(N \log N).
\]

Proof. This is classical and can, up to reformulation into more elementary language, be found
in [2], in the proof of Theorem 3.9. □
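
The asymptotics are already visible for small N. The following sketch counts the points of bounded height by brute force and compares with the main term; `count_points` is an illustrative name, and each point is represented by its unique coprime pair (a, b) with b > 0, plus the point (1 : 0).

```python
from math import gcd, pi


def count_points(N):
    """Number of x in P^1(Q) with H(x) <= N."""
    count = 1                                # the point at infinity (1 : 0)
    for b in range(1, N + 1):
        for a in range(-N, N + 1):
            if gcd(a, b) == 1:               # (a : b) in lowest terms, b > 0
                count += 1
    return count


if __name__ == "__main__":
    for N in (10, 50, 100):
        exact = count_points(N)
        print(N, exact, exact / (12 / pi ** 2 * N ** 2))
```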

For other number fields, there is a similar estimate:

Theorem 4.6 (Schanuel). Let $K$ be a number field of degree $d_K > 1$. Then for $N \to \infty$ we
have

\[
\operatorname{card}\{x \in \mathbb{P}^1(K) \mid H(x) \le N\} = c_K \cdot N^{2d_K} + O(N^{2d_K - 1}),
\]

with the constant

\[
c_K = \frac{2^{2r_1 + r_2 - 1} (2\pi)^{r_2} \operatorname{Res}_{s=1} \zeta_K(s)}{\sqrt{|\Delta_K|}\, \zeta_K(2)}
    = \frac{h_K \cdot R_K \cdot 2^{3r_1 + r_2 - 1} \cdot (2\pi)^{2r_2}}{w_K \cdot |\Delta_K| \cdot \zeta_K(2)}.
\]

Here, as usual, $r_1$ is the number of real places, $r_2$ the number of complex places, $\Delta_K$ the
discriminant, $\zeta_K$ the Dedekind zeta function, $h_K$ the class number, $R_K$ the regulator and $w_K$
the number of roots of unity in $K$.



Proof. This is a special case of the main result in [13]; the equality of the two expressions for
$c_K$ follows from the class number formula. Note that Schanuel uses a different normalisation
for the height, whence the different exponent. □

Note that for $K = \mathbb{Q}$, the formula for $c_K$ gives $12/\pi^2$, as above; the only reason that we
have to treat this case separately is that the error term has a different shape. And, of course,
Theorem 4.5 is much more elementary than Theorem 4.6.

The notion of height helps us to measure the "size" of a subset $A \subseteq \mathbb{P}^1(K)$.

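Definition 4.7. Let $K$ be a number field and $A \subseteq \mathbb{P}^1(K)$. Its lower height density is the
number

\[
\delta_H^-(A) = \liminf_{N \to \infty} \frac{\operatorname{card}\{x \in A \mid H(x) \le N\}}{\operatorname{card}\{x \in \mathbb{P}^1(K) \mid H(x) \le N\}} \in [0, 1];
\]

its upper height density is the number

\[
\delta_H^+(A) = \limsup_{N \to \infty} \frac{\operatorname{card}\{x \in A \mid H(x) \le N\}}{\operatorname{card}\{x \in \mathbb{P}^1(K) \mid H(x) \le N\}} \in [0, 1].
\]

If these two are equal, we say that "$A$ has a height density" and call the quantity $\delta_H(A) =
\delta_H^-(A) = \delta_H^+(A)$ the height density of $A$.

By Theorems 4.5 and 4.6 we see that $A$ has a height density if and only if the limit

\[
\lim_{N \to \infty} \frac{\operatorname{card}\{x \in A \mid H(x) \le N\}}{N^{2d_K}}
\]

exists, and the height density is then this limit divided by the constant $c_K$.
We now give some examples for height density.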
We now give some examples for height density. 

(i) If $K$ is given as a subfield of $\mathbb{R}$, then the set of all positive $x \in K$ has height density $\frac{1}{2}$.
This is because $H(x) = H(-x)$.

(ii) If $K$ is a number field of degree $d$, $\gamma \in \mathrm{PGL}_2(K)$ is a Möbius transformation and
$A \subseteq \mathbb{P}^1(K)$ is any subset, then

\[
\delta_H^-(\gamma(A)) \ge \frac{1}{(2H(\gamma))^{2d}}\, \delta_H^-(A)
\quad\text{and}\quad
\delta_H^+(\gamma(A)) \le (2H(\gamma))^{2d}\, \delta_H^+(A).
\]

This follows from Proposition 4.3 together with the observation that the number of
points of height below $N$ grows like $N^{2d}$. In particular if $A$ has nonzero lower height
density, then so has $\gamma(A)$.

(iii) Combining the two previous examples, we see: if $K \subset \mathbb{R}$ is a number field and $a < b$,
then the subset $K \cap [a, b] \subset \mathbb{P}^1(K)$ has positive lower height density, since there exists
a Möbius transformation in $\mathrm{PGL}_2(K)$ which maps $[0, \infty)$ into $[a, b]$.

(iv) The set $\mathbb{Q}^{\mathrm{even}}_{>0}$ introduced before has positive lower height density in $\mathbb{P}^1(\mathbb{Q})$. This can be
seen as follows. Let us estimate the number of pairs $(p, q) \in \mathbb{N}^2$ with $p, q$ coprime,
$q < p \le N$ and $p$ even. If we can show that this number is bounded below by some
positive constant times $N^2$, we are done.

Now this number is equal to

\[
\sum_{\substack{1 < p \le N \\ p \text{ even}}} \varphi(p)
\;\ge\; \sum_{\substack{1 < p \le N \\ p \text{ even}}} \varphi\Bigl(\frac{p}{2}\Bigr)
= \sum_{n=1}^{\lfloor N/2 \rfloor} \varphi(n)
= \frac{3}{\pi^2} \Bigl(\frac{N}{2}\Bigr)^2 + O(N \log N).
\]

The first inequality is derived from the elementary inequality $\varphi(2n) \ge \varphi(n)$, and the
final equality follows from [2, Theorem 3.7].
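
The counting argument in example (iv) can be replayed numerically. A small sketch with naive helpers (`phi`, `even_numerator_pairs` and `lower_bound` are illustrative names; the totient is computed by brute force, which is fine for small N):

```python
from math import gcd


def phi(n):
    """Euler's totient function, computed naively."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)


def even_numerator_pairs(N):
    """Number of coprime pairs (p, q) with q < p <= N and p even."""
    return sum(1 for p in range(2, N + 1, 2)
               for q in range(1, p) if gcd(p, q) == 1)


def lower_bound(N):
    """The sum of phi(n) for n up to floor(N/2), used as a lower bound in the text."""
    return sum(phi(n) for n in range(1, N // 2 + 1))


if __name__ == "__main__":
    for N in (10, 50, 100):
        print(N, even_numerator_pairs(N), lower_bound(N))
```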



5. Constraints on Injective Families 

In this final section we shall show that if an injective family consists only of maps of degree
at least two, then its image in $\mathbb{P}^1(K)$ must have height density zero. So to get started, assume
that $K$ is a number field and $(f_1, \dots, f_n) \in \mathcal{M}(K)^n$ is an injective family for some initial value
$x_0 \in \mathbb{P}^1(K)$, where $\deg f_i \ge 2$ for all $i$. Denote by $\Gamma$ the free monoid generated by the $f_i$ in
$\mathcal{M}(K)$, and let $\|\gamma\|$ be the word norm on $\Gamma$. That is, for $\gamma = f_{i_1} f_{i_2} \cdots f_{i_r}$ set $\|\gamma\| = r$.

We prefer to work with logarithmic heights in this section. By Theorem 4.4 we find a
constant $c > 0$ such that for all $1 \le i \le n$ and all $x \in \mathbb{P}^1(K)$, the inequality

\[
h(f_i(x)) \ge 2h(x) - c
\tag{10}
\]

holds. By replacing $c$ with a larger constant if necessary, we may also assume that

\[
c \ge 1.
\]

Hence $\Gamma$ "explodes" heights outside the exceptional set

\[
S = \{x \in \mathbb{P}^1(K) \mid h(x) \le 2c\}.
\]

By Theorem 4.5 or 4.6, depending on whether $K = \mathbb{Q}$ or not, this is a finite set.

Lemma 5.1. Under these assumptions, every element of $\Gamma$ takes the complement of $S$ to
itself. In formulas:

\[
\Gamma(\mathbb{P}^1(K) \setminus S) \subseteq \mathbb{P}^1(K) \setminus S.
\tag{11}
\]

Furthermore, for any $x \in \mathbb{P}^1(K) \setminus S$ and $\gamma \in \Gamma$ we have the inequality

\[
h(\gamma(x)) \ge \Bigl(\frac{3}{2}\Bigr)^{\|\gamma\|} \cdot h(x).
\tag{12}
\]

Proof. Let $x$ be in the complement of $S$, i.e. $h(x) > 2c$. Then from (10) we obtain

\[
h(f_i(x)) \ge 2h(x) - c > 4c - c > 2c.
\]

In particular, $f_i(x) \notin S$. Since the $f_i$ generate $\Gamma$, this shows the first part.

The second inequality also needs only to be checked for $\gamma = f_i$ or, equivalently, $\|\gamma\| = 1$; the
general case then follows by induction on $\|\gamma\|$, using the first part. But
using that $c < \frac{1}{2}h(x)$, we find that

\[
h(f_i(x)) \ge 2h(x) - c > 2h(x) - \frac{1}{2}h(x) = \frac{3}{2}h(x),
\]

which is just what is to be proved for $\|\gamma\| = 1$. □

If one enlarges $S$ suitably, the estimate can of course be sharpened in such a way that the
constant $\frac{3}{2}$ can be replaced by any $2 - \varepsilon$ with $\varepsilon > 0$.

Because the orbit map $\gamma \mapsto \gamma(x_0)$ is injective and $S$ is finite, it can hit $S$ only up to a finite word length.
So there exists some $n_0 \in \mathbb{N}$ with the property that whenever $\|\gamma\| \ge n_0$, then $\gamma(x_0) \notin S$ (and
consequently $h(\gamma(x_0)) > 2c$).

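Lemma 5.2. Let $\gamma \in \Gamma$ with $\|\gamma\| \ge n_0$. Then

\[
h(\gamma(x_0)) > \Bigl(\frac{3}{2}\Bigr)^{\|\gamma\| - n_0}.
\]

Proof. Set $m = \|\gamma\| - n_0$. Write $\gamma = \gamma_1\gamma_2$ with $\|\gamma_1\| = m$ and $\|\gamma_2\| = n_0$. Then

\[
h(\gamma(x_0)) = h(\gamma_1(\gamma_2(x_0))) \ge \Bigl(\frac{3}{2}\Bigr)^{\|\gamma\| - n_0} \cdot h(\gamma_2(x_0)) > \Bigl(\frac{3}{2}\Bigr)^{\|\gamma\| - n_0} \cdot 2c > \Bigl(\frac{3}{2}\Bigr)^{\|\gamma\| - n_0}.
\]

The "$\ge$" sign is obtained from Lemma 5.1, setting $x = \gamma_2(x_0) \notin S$ (by assumption on $\gamma_2$). The
first "$>$" is justified again by the observation that $\gamma_2(x_0) \notin S$ and the definition of $S$. The
second "$>$" sign finally is justified by $c \ge 1$ (remember we made it that way). □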

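Proposition 5.3. Under the above assumptions, there exist constants $c' > 0$ and $k \in \mathbb{N}$ such
that for all sufficiently big positive reals $B$ one has

\[
\operatorname{card}\{\gamma \in \Gamma \mid h(\gamma(x_0)) \le B\} \le c' \cdot B^k.
\tag{13}
\]

Proof. Since $\Gamma$ is free on $r$ generators (here $r = n$ is the number of maps in the family), we get that

\[
\operatorname{card}\{\gamma \in \Gamma \mid \|\gamma\| \le C\} = \sum_{\nu = 0}^{\lfloor C \rfloor} r^\nu < r^{C + 1}
\]

if $r \ge 2$; for $r = 1$ we get the even simpler estimate $\lfloor C \rfloor + 1$ that will also do the job. We
assume from now on that $r \ge 2$ since the calculation for $r = 1$ is even easier.

By Lemma 5.2 we find

\[
\begin{aligned}
\operatorname{card}\{\gamma \in \Gamma \mid h(\gamma(x_0)) \le B\}
  &\le \operatorname{card}\Bigl\{\gamma \in \Gamma \Bigm| \Bigl(\frac{3}{2}\Bigr)^{\|\gamma\| - n_0} \le B\Bigr\} \\
  &= \operatorname{card}\Bigl\{\gamma \in \Gamma \Bigm| (\|\gamma\| - n_0) \log\frac{3}{2} \le \log B\Bigr\} \\
  &= \operatorname{card}\Bigl\{\gamma \in \Gamma \Bigm| \|\gamma\| \le n_0 + \frac{\log B}{\log\frac{3}{2}}\Bigr\} \\
  &< r^{n_0 + \log B/\log\frac{3}{2} + 1} = r^{n_0 + 1} \cdot B^{\log r/\log\frac{3}{2}},
\end{aligned}
\]

so that setting $c' = r^{n_0 + 1}$ and $k = \lceil \log r/\log\frac{3}{2} \rceil$ will yield the desired estimate. □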

Theorem 5.4. Let $K$ be a number field and $(f_1, \dots, f_n) \in \mathcal{M}(K)^n$ an injective family for the
initial value $x_0 \in \mathbb{P}^1(K)$. Assume that $\deg f_i \ge 2$ for all $1 \le i \le n$. Let $\Gamma \subseteq \mathcal{M}(K)$ be the
submonoid generated by the $f_i$. Then the image $\Gamma(x_0) \subseteq \mathbb{P}^1(K)$ has height density zero.

Proof. We translate the previous considerations back from statements about logarithmic
heights into statements about heights. Since $H(x) \le N$ if and only if $h(x) \le \log N$, we
see from Proposition 5.3 that there exists a positive integer $k$ with

\[
\operatorname{card}\{x \in \Gamma(x_0) \mid H(x) \le N\} = O((\log N)^k).
\]

Comparing this with Theorems 4.5 and 4.6 we see that $\Gamma(x_0)$ must have height density zero. □
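
To see how thin such an orbit is, one can count orbit points of bounded height for a concrete family. The example below is our own and not taken from the text: the maps x -> x² + 1 and x -> x² + 2 over Q with initial value 1. Every orbit point is a positive integer, so its height is the integer itself, and the family is injective at 1 because x² - y² = 1 has no solution in positive integers and 1 is not in either image.

```python
from math import pi


def count_orbit_points(N, x0=1):
    """Number of orbit points of height <= N for the maps x^2 + 1 and x^2 + 2."""
    maps = (lambda x: x * x + 1, lambda x: x * x + 2)
    count = 0
    stack = [x0]
    while stack:
        x = stack.pop()
        if x > N:
            continue        # both maps increase x, so the whole branch can be pruned
        count += 1
        stack.extend(f(x) for f in maps)
    return count


if __name__ == "__main__":
    for N in (10 ** 3, 10 ** 6, 10 ** 12):
        print(N, count_orbit_points(N), 12 / pi ** 2 * N ** 2)
```

The counts grow like a power of log N, in line with Proposition 5.3, while the number of all points of height at most N grows quadratically.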

We have seen before that in the case $K = \mathbb{Q}$, for every $n \ge 2$ there exists an injective family
whose orbit has positive height density and which consists of $n$ Möbius transformations. It is
easy to see that we cannot get positive height density for a family consisting of just one Möbius
transformation. Note, however, that Newman's map

\[
x \mapsto \frac{1}{1 + \lfloor x \rfloor - \{x\}},
\]

being not terribly far from a Möbius transformation, gives an "injective family" with just
one element, whose orbit $\mathbb{Q}_{>0}$ has height density $\frac{1}{2}$.

The last theorem tells us that we cannot get positive height density if we only work with
maps of higher degree. So there remain two open questions: what about the mixed case, i.e.
injective families consisting of both Möbius transformations and higher degree maps, and what
about Möbius transformations in general number fields?

We conjecture that the condition "$\deg f_i \ge 2$ for all $i$" in Theorem 5.4 can be relaxed to
the weaker condition "$\deg f_i \ge 2$ for at least one $i$". In other words, that if the orbit of
an injective family has positive upper height density, then the family must consist entirely
of Möbius transformations. Note that then the injectivity of the family would be a crucial
condition since otherwise we could just add some higher degree maps to the Calkin-Wilf family.
As to the second question, there might be interesting trees similar to the Calkin-Wilf tree
already over quadratic number fields.